Introduction

Over the years, rapid changes in technology have transformed a myriad of daily tasks. Education is one such task, and it changed fundamentally with the advent of educational technology: the combined use of computer hardware, software, and educational theory and practice to facilitate e-learning. One well-known program of this kind is MIT OpenCourseWare (OCW), launched by MIT.

OpenCourseWare (OCW) is course material created at universities and published for free on the web. OCW projects first appeared in the late 1990s and, after gaining traction in Europe and then the United States, have become a worldwide means of delivering educational content via the Internet.

A similar initiative was launched in 2012, when the Massachusetts Institute of Technology (MIT) and Harvard University began offering open online courses on edX, a non-profit learning platform co-founded by the two institutions. New learners often have questions about such online tools before they delve into them. Through this analysis, we intend to highlight key insights and statistics that could answer most of these frequently asked questions.

This report walks you through three key sections:

  1. University level analytics

  2. Analysis based on enrollments and student engagement

  3. Certifications and Audit statistics

University level analytics

In this section, we answer some key questions through visualizations and statistics: (1) How many courses do Harvard and MIT each offer on the online platforms? (2) How are these courses distributed with respect to gender, domain of study, university, etc.? (3) What is the estimated proportion of learner engagement with the courses and content on this platform?

Fig (1.1) shows the proportion of courses on edX and OpenCourseWare offered by Harvard and MIT. MIT provides 55.52% of the course content on these online learning platforms, while Harvard accounts for the remaining 44.48%.
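As a minimal sketch of how such shares could be computed, the snippet below counts courses per institution and converts the counts to percentages. The record layout and the toy values are assumptions made for illustration; they are not taken from the dataset used in this report.

```python
from collections import Counter

# Hypothetical toy records: one institution label per course.
# Illustrative values only, NOT taken from the report's dataset.
courses = ["MITx", "MITx", "MITx", "HarvardX", "HarvardX"]


def institution_share(labels):
    """Return each institution's share of the courses as a percentage."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


shares = institution_share(courses)
print(shares)
```

The shares always sum to 100% (up to rounding), which is a quick sanity check for a chart such as Fig (1.1).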

The courses can be broadly categorized into 4 domains: Computer Science; Science, Technology, Engineering, and Mathematics (STEM); and two non-STEM domains, grouped here as Others, namely Government, Health, and Social Science, and Humanities, History, Design, Religion, and Education. All these domains have course content on the online learning platforms. Fig (1.2) shows the distribution of content across the domains.

Fig (1.2) shows that STEM has more course offerings than Computer Science. A likely reason is that Engineering is a broad field with many sub-domains; together with the Technology and Mathematics subjects, these make up a larger proportion than Computer Science alone.

Fig (1.3) shows the distribution of courses offered by Harvard and MIT across the domains. STEM courses make up more of the offerings at MIT, while Humanities, History, Design, and Religion courses make up more of the offerings at Harvard.

Every user can engage with a course in one of two ways: audit the course for free, or pay for a version of the course that leads to certification. All courses offer both modes. Fig (1.4) shows, by institution, the percentage of enrollments that were audited versus the percentage that accessed the certified version.

Analysis based on enrollments and student engagement

To better understand the relationship between course participants and various attributes, we delve into the following questions: (1) How many participants are there in each type of course (STEM and Non-STEM), and what is their gender distribution for each year from 2012 to 2016? (2) What is the total number of participants in each subject, and what is their gender distribution? (3) How does participation vary by institution, and what percentage of participants hold advanced qualifications?

By answering these questions, we will gain a deeper understanding of the demographics of course participants and their academic backgrounds, which will help us to identify trends and patterns in education.

From Figure (2.1), it can be noted that the trend of course participation has been on the rise from 2012 to 2015. However, there is a decrease in the number of participants from 2015 to 2016, regardless of the course type and gender of the participants. It is also evident that the peak of enrollment occurred in 2015, with approximately 650,000 male participants enrolled in STEM courses. Furthermore, it can be observed that there is a disparity in gender participation, with more males participating in STEM courses, and a higher number of females enrolled in non-STEM courses.

A more in-depth analysis would be needed to determine the reasons behind the decrease in participation from 2015 to 2016.
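A breakdown like the one in Figure (2.1) could be tabulated roughly as follows. This is a sketch under stated assumptions: the row layout (year, course type, gender, participants) and all numbers are hypothetical and chosen only to make the example easy to follow.

```python
from collections import defaultdict

# Hypothetical rows: (year, course_type, gender, participants).
# Illustrative values only, NOT taken from the report's dataset.
rows = [
    (2014, "STEM", "male", 300),
    (2014, "STEM", "female", 100),
    (2015, "STEM", "male", 500),
    (2015, "Non-STEM", "female", 250),
    (2015, "STEM", "male", 150),
    (2016, "STEM", "male", 400),
]


def participants_by_group(rows):
    """Sum participants for each (year, course_type, gender) group."""
    totals = defaultdict(int)
    for year, course_type, gender, n in rows:
        totals[(year, course_type, gender)] += n
    return dict(totals)


def yearly_totals(rows):
    """Sum participants per year, across course types and genders."""
    totals = defaultdict(int)
    for year, _course_type, _gender, n in rows:
        totals[year] += n
    return dict(sorted(totals.items()))


group_totals = participants_by_group(rows)
peak_group = max(group_totals, key=group_totals.get)  # group with most participants
per_year = yearly_totals(rows)

print(peak_group, group_totals[peak_group])
print(per_year)
```

Comparing consecutive entries of `per_year` gives the year-over-year change, which is a natural starting point for examining a decline such as the one between 2015 and 2016.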

Figure (2.2) provides a deeper analysis of the subject areas studied by the participants. The largest number of participants took Computer Science, followed by Science, Technology, Engineering, and Mathematics (STEM) and then Government, Health, and Social Science. The number of participants who pursued Humanities, History, Design, Religion, and Education is relatively lower. The number of female participants is roughly the same in three of the subject areas, at around 300,000; in STEM it is slightly lower, at 267,000. This highlights the need for efforts to encourage more female participation in STEM fields.

Fig (2.3) and Fig (2.4) show a distinct difference between the number of STEM and non-STEM courses offered at Harvard and MIT: HarvardX offers more non-STEM courses, while MITx offers more STEM courses. Moreover, among HarvardX enrollments by participants with some form of higher educational qualification, over 50% are in non-STEM courses and the remainder are in STEM courses. At MIT, less than 80% of enrollments by participants with advanced degrees are in STEM courses, and the remainder are in non-STEM courses.

Certifications and Audit statistics

This section deals with student-level statistics. The questions we aim to answer in this section are: (1) What is the certification rate in each subject? (2) What is the average length of a course in each subject, and how long does a student take to complete it on average? (3) How well does a student perform if they actively participate in forum discussions?

Figure 3.1 shows the certification rate per subject offered by the two universities. Computer Science courses have a much lower certification rate at Harvard than at MIT. One possible explanation is that the Harvard courses are tougher than the ones offered by MIT, although the certification rate alone cannot confirm this.

Government, Health, and Social Science courses seem reasonably popular at both universities, with a certification rate of almost 15% at each. Humanities, History, Design, Religion, and Education courses, however, seem more popular at Harvard than at MIT, and Harvard also has the better certification rate there. As for STEM courses, even though Harvard started offering them almost a year later than MIT, it still has a better certification rate than MIT.

In terms of certification rate, then, MIT leads in Computer Science, while Harvard matches or exceeds MIT in the other subjects.
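The certification rates compared above can be thought of as certified participants divided by total participants, pooled per institution and subject. The sketch below assumes that definition; the row layout and numbers are hypothetical and are not the values behind Figure 3.1.

```python
from collections import defaultdict

# Hypothetical per-course rows: (institution, subject, participants, certified).
# Illustrative values only, NOT taken from the report's dataset.
rows = [
    ("HarvardX", "Computer Science", 2000, 40),
    ("MITx", "Computer Science", 1000, 80),
    ("MITx", "Computer Science", 1000, 120),
    ("HarvardX", "STEM", 500, 50),
    ("MITx", "STEM", 1000, 60),
]


def certification_rates(rows):
    """Return the pooled certification rate (%) per (institution, subject)."""
    participants = defaultdict(int)
    certified = defaultdict(int)
    for institution, subject, n, c in rows:
        participants[(institution, subject)] += n
        certified[(institution, subject)] += c
    return {key: 100 * certified[key] / participants[key] for key in participants}


rates = certification_rates(rows)
for key, rate in sorted(rates.items()):
    print(key, f"{rate:.1f}%")
```

Pooling the counts before dividing weights each course by its enrollment. Averaging per-course rates instead would give small courses the same weight as large ones, so the two approaches can rank subjects differently.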

Figure 3.2 shows the number of hours taken to finish a course versus the time it takes to get certified. Even though the total time for a Computer Science course is 280 hours, the time to get the certificate is around 45 hours. In terms of time to certification, however, STEM is the most time-consuming, with a total time of 87 hours and around 78 hours needed to get certified. Of all the subjects, Humanities, History, Design, and Religion is the least time-consuming.

Figure 3.3 explores the relationship between course hours and the number of times a student has posted in the forums. The more students interact in the forums, the more likely they are to resolve their doubts and learn more, which is reflected in a higher certification rate.

As for Humanities, History, Design, Religion, and Education, we have already seen that it is the least time-consuming subject. It also has the highest interaction rate and more certified individuals. This suggests that courses with fewer hours tend to have higher certification rates.
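One way to quantify the relationship discussed above is a Pearson correlation coefficient between forum activity and certification rate. The snippet below is only a sketch: the per-course pairs are hypothetical, and the report itself does not state which statistic, if any, underlies Figure 3.3.

```python
import math

# Hypothetical per-course pairs:
# (average forum posts per participant, certification rate in %).
# Illustrative values only, NOT taken from the report's dataset.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]


def pearson(pairs):
    """Return the Pearson correlation coefficient of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


r = pearson(points)
print(round(r, 3))
```

A coefficient close to 1 indicates that the two quantities rise together. It does not by itself show that posting in forums causes certification, since engaged students may both post more and finish more often.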

Conclusion

From the analyses presented, we can see that many courses are offered by both MIT and Harvard. However, STEM courses are popular at MIT, whereas Non-STEM courses are popular at Harvard. We also analysed course preferences by age, gender, and highest qualification of the users: students above age 35 prefer Non-STEM courses at either university, whereas students below 35 follow the popular trends. With respect to gender, women preferred Non-STEM courses over STEM courses across all years.

Further, we looked at the percentage of people auditing a course versus accessing the paid certification version, and at the impact of course length and forum postings on certification. Two factors play a major role in people getting certified. First, posting in forums helps students clarify concepts, so the certification rate is higher. Second, when the number of course hours is lower, more people tend to get certified than in courses with longer hours.

The analysis and charts show how Harvard and MIT compare in popularity and how they are positioned in terms of course content and student engagement. This kind of analysis can help universities identify which courses need more focus and which domains of study need new courses. It can also help to promote knowledge and attract talent from chronically underrepresented groups, such as women in STEM or the IT industry.

Reference

Link to the dataset used in this report: Dataset